Add and Configure a Data Source in an Agent
Data sources enable your Agent to access specific knowledge, grounding its responses in relevant content. The Agent uses the configured search settings and the indexes created for the data source to retrieve information, which helps it generate more accurate outputs for user queries.
Prerequisites
- A data source must be created and configured. For details, refer to Data Source Connectors.
Add a Data Source
To add a data source to your Agent:
- While creating your Agent, drag and drop the desired data source from the Data Sources section in the left side panel into your Agent workflow.
Configure Search Settings
After adding a data source, configure its search behavior:
- Select Files for the Agent (Optional): By default, the Agent retrieves data from the entire data source. To narrow the search to specific documents, click the Select files for this Agent button and choose the desired files.
- Choose Search Type: Select the search type best suited for your use case.
Semantic Search
Semantic search retrieves information based on keyword matches and semantic similarity using vector search. This method is ideal for unstructured documents.
- Max results: Specify the maximum number of results to retrieve and send to the LLM for generating output.
- Relevance threshold: Set a minimum relevance (similarity) score. Only results that meet this threshold are returned to the LLM, which helps filter out noise.
- Neighboring chunks: In addition to the max results, specify the number of neighboring chunks to send to the LLM for richer context.
- Hybrid search: Blends results from keyword matching and semantic relevance retrieval. Adjust the slider to configure the weight between keyword and semantic relevance.
Hybrid search is only available if it was selected during the Vector database configuration when the data source was created. Refer to Configure Vector Database for more information.
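One way to picture how these four settings interact is the minimal Python sketch below. It is not the product's implementation: the `Chunk` class, the `retrieve` function, the sample scores, and the order of operations (blend, filter, rank, expand) are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    index: int             # position of the chunk within its document
    text: str
    semantic_score: float  # hypothetical vector-similarity score in [0, 1]
    keyword_score: float   # hypothetical keyword-match score in [0, 1]


def retrieve(chunks, max_results=3, relevance_threshold=0.5,
             neighboring_chunks=1, semantic_weight=0.7):
    """Return the indexes of the chunks to send to the LLM, in document order.

    Assumes chunks[i].index == i (chunks are listed in document order).
    """
    # Hybrid search: blend keyword and semantic scores with a single weight,
    # playing the role of the slider (assumed linear blend).
    def blended(chunk):
        return (semantic_weight * chunk.semantic_score
                + (1 - semantic_weight) * chunk.keyword_score)

    # Relevance threshold: drop chunks that score below the cutoff.
    relevant = [c for c in chunks if blended(c) >= relevance_threshold]

    # Max results: keep only the best-scoring hits.
    hits = sorted(relevant, key=blended, reverse=True)[:max_results]

    # Neighboring chunks: add the chunks adjacent to each hit for context.
    selected = set()
    for hit in hits:
        start = hit.index - neighboring_chunks
        stop = hit.index + neighboring_chunks + 1
        for i in range(start, stop):
            if 0 <= i < len(chunks):
                selected.add(i)
    return sorted(selected)


# Made-up example data.
chunks = [
    Chunk(0, "Intro", 0.10, 0.00),
    Chunk(1, "Refund policy overview", 0.80, 0.60),
    Chunk(2, "Refund deadlines", 0.90, 0.90),
    Chunk(3, "Shipping rates", 0.30, 0.10),
    Chunk(4, "Contact details", 0.20, 0.00),
    Chunk(5, "Warranty terms", 0.60, 0.40),
]

print(retrieve(chunks, max_results=2, relevance_threshold=0.5, neighboring_chunks=1))
# prints [0, 1, 2, 3]
```

In this sketch, chunks 1 and 2 are the two hits, and chunks 0 and 3 are included only because they are neighbors. Raising the threshold or lowering max results shrinks the context sent to the LLM; raising neighboring chunks grows it.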
Text-to-SQL Search
Text-to-SQL search is suitable for .csv and .xlsx files, especially when the data is primarily numerical and lacks deep semantic meaning. This method allows the Agent to generate SQL queries from natural language input to retrieve structured results.
- Model Selection: Choose the LLM responsible for generating SQL queries within the Agent workflow.
Recommendation: For stable and accurate results, select a model from the High Quality tier.
- High Quality (best performance):
  - Claude 4 Sonnet
  - GPT 4.1
  - Claude 3.7 Sonnet
  - GPT 4o
- Sufficient Quality:
  - GPT 4.1 mini
  - Claude 3.5 Sonnet
  - GPT 4o mini
- Fuzzy Search: Enable this option so the system can match records even when the user’s query contains misspellings.
Fuzzy search can increase query generation complexity.
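To make the Text-to-SQL idea concrete, here is a minimal Python sketch that loads tabular data into an in-memory SQLite table and runs the kind of query an LLM might generate. Everything in it is an assumption for illustration: the sample CSV, the `sales` table, the helper names `load_csv` and `fuzzy_value`, and the use of `difflib` as a stand-in for fuzzy search. The SQL is written by hand here, and the Agent's actual query generation and fuzzy matching may work differently.

```python
import csv
import difflib
import io
import sqlite3

# Hypothetical CSV content standing in for an uploaded .csv file.
CSV_TEXT = """region,product,units
North,Widget,120
South,Widget,80
North,Gadget,45
"""


def load_csv(conn, table, text):
    """Load the sample CSV into a SQLite table (schema hard-coded for the sketch)."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute(f"CREATE TABLE {table} (region TEXT, product TEXT, units INTEGER)")
    conn.executemany(
        f"INSERT INTO {table} VALUES (:region, :product, :units)", rows
    )


def fuzzy_value(conn, table, column, user_value):
    """Map a possibly misspelled value to the closest value stored in the column."""
    known = [row[0] for row in conn.execute(f"SELECT DISTINCT {column} FROM {table}")]
    matches = difflib.get_close_matches(user_value, known, n=1, cutoff=0.6)
    # Fall back to the user's input when nothing is close enough.
    return matches[0] if matches else user_value


conn = sqlite3.connect(":memory:")
load_csv(conn, "sales", CSV_TEXT)

# User question: "How many units of Wigdet were sold in total?" (note the typo)
product = fuzzy_value(conn, "sales", "product", "Wigdet")

# A query of the kind an LLM might generate for that question.
total = conn.execute(
    "SELECT SUM(units) FROM sales WHERE product = ?", (product,)
).fetchone()[0]

print(product, total)  # prints: Widget 200
```

The extra lookup needed to resolve the misspelled value hints at why fuzzy search can make query generation more complex: the query has to account for values that do not match the stored data exactly.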
Important: Both Semantic and Text-to-SQL search require that the data source was configured, and its indexes created, when the data source was set up. See Ingestion settings for details.
Choosing the Right Search Method
The optimal search method depends on your query type, data structure, and desired outcome:
- Use SQL Retrieval for:
  - Structured files (.csv, .xlsx).
  - Precise, structured queries.
  - Efficiently answering qualitative questions.
  - Data that is mostly numerical and does not have strong semantic meaning.
- Use Semantic Retrieval for:
  - Natural language queries.
  - Unstructured or text-heavy documents.
  - Cases requiring semantic understanding.
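As a rough rule of thumb, the file-type part of this guidance can be sketched in a few lines of Python. The function name `suggest_search_type` and the returned labels are hypothetical, and the sketch looks only at the file extension; query type and desired outcome still need human judgment.

```python
from pathlib import Path

# File types the guidance above associates with SQL retrieval.
STRUCTURED_EXTENSIONS = {".csv", ".xlsx"}


def suggest_search_type(filename: str) -> str:
    """Suggest a search type from the file extension alone (a simplification)."""
    suffix = Path(filename).suffix.lower()
    return "text-to-sql" if suffix in STRUCTURED_EXTENSIONS else "semantic"


for name in ["sales_2024.csv", "employee_handbook.pdf"]:
    print(name, "->", suggest_search_type(name))
```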